Compression of line spectral frequency parameters using the asynchronous interpolation model
Abstract
We apply an asynchronous interpolation model (AIM) to line spectral frequency trajectories. AIM represents speech transition features as crossfading between basis vector features, governed by individual interpolation weights per feature component. Basis vectors are initialized from demiphone labels and then optimized using a local reconstruction error. Using a small diphone acoustic inventory, we reduce the number of parameters by using dimension-reduced latent-space weights and a vector-quantized pool of basis vectors. The highest compression rate of 1:11 resulted in a log spectral distortion of 4.83 dB.
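The crossfading idea in the abstract can be illustrated with a minimal sketch. It assumes a single transition between two basis vectors and a sigmoid-shaped weight trajectory per feature component; the abstract does not specify the weight function, so the sigmoid and the names `sigmoid_weights` and `aim_crossfade` are hypothetical and used only for illustration.

```python
import math


def sigmoid_weights(num_frames, centers, slopes):
    """Build one weight trajectory per feature component.

    Assumption: each component d follows a sigmoid with its own
    center (frame index of the 50% point) and slope. The result is a
    list of num_frames rows, each with len(centers) weights in (0, 1).
    """
    return [
        [1.0 / (1.0 + math.exp(-s * (t - c))) for c, s in zip(centers, slopes)]
        for t in range(num_frames)
    ]


def aim_crossfade(left, right, weights):
    """Crossfade from basis vector `left` to basis vector `right`.

    weights[t][d] is the share of `right` for component d at frame t.
    Because every component has its own weight, components can switch
    at different times, which is what makes the interpolation
    asynchronous.
    """
    return [
        [(1.0 - w) * a + w * b for a, b, w in zip(left, right, row)]
        for row in weights
    ]


# Two-component example: component 0 crosses over at frame 2,
# component 1 at frame 3 (illustrative values, not from the paper).
weights = sigmoid_weights(5, centers=[2.0, 3.0], slopes=[1.0, 1.0])
trajectory = aim_crossfade([0.1, 0.2], [0.3, 0.5], weights)
```

In this sketch a transition is described by two basis vectors plus a center and slope per component, rather than by one full feature vector per frame.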
Similar resources
Unit-selection text-to-speech synthesis using an asynchronous interpolation model
We describe the Asynchronous Interpolation Model, which represents speech as a composition of several different types of feature streams that are computed using asynchronous interpolation of neighboring basis vectors, according to transition weights. When applied to the acoustic inventory of a concatenative Text-to-Speech synthesizer, the model eliminates concatenation errors and affords opport...
Interpolation properties of linear prediction parametric representations
In this paper, interpolation of linear predictive coding (LPC) parameters in terms of the following representations is investigated: linear prediction coefficient representation, reflection coefficient representation, log-area-ratio representation, arc-sine reflection coefficient representation, cepstral coefficient representation, line spectral frequency representation, autocorrelation coeffici...
Low Resource TTS Synthesis Based on Cepstral Filter with Phase Randomized Excitation
In this paper we present the acoustic synthesis of a low resource Text-To-Speech (TTS) system based on a 7th-order cepstral filter. The excitation signal is designed in the frequency domain by a two-parameter model. This model is able to generate the excitation signal for both voiced and unvoiced segments. The sets of filter coefficients represent the speech units and are stored in a compressed fo...
Robust Transmission of Speech LSFs Using Hidden Markov Model-Based Multiple Description Index Assignments
Speech coding techniques capable of generating encoded representations which are robust against channel losses play an important role in enabling reliable voice communication over packet networks and mobile wireless systems. In this paper, we investigate the use of multiple description index assignments (MDIAs) for loss-tolerant transmission of line spectral frequency (LSF) coefficients, typica...
Power System Analysis for Nonsinusoidal Steady State Studies Based on Wavelets
In this paper, the power system model is represented in a new domain related to Multi-Resolution Analysis (MRA) space. By developing mathematical models of the elements in this space using the Galerkin method, a new alternative method for power system simulation under nonsinusoidal and periodic conditions is developed. The mathematical formulation and characteristics of the proposed space are presented. Als...
Publication date: 2010